1.  INTRODUCTION

The  project,  entitled  "Neural  Scene  Segmentation  By  Oscillatory  Correlation"  (N00014-96-1-
0676),  was  awarded  as  a  three-year  YIP  grant  in  May  1996,  and  it  ended  in  September  1999.  The 
total  project  budget  was  $330,501.  The  goal  of  the  project  was  to  develop  the  oscillatory 
correlation  approach  for  image  segmentation,  whereby  the  binding  of  pixels  is  encoded  by  phases 
of  neural  oscillators.  This  project  was  very  productive,  and  led  to  major  accomplishments. 

In  this  final  project  report,  I  summarize  the  progress  made  during  this  project.  Section  2 
provides  an  overview  of  scientific  progress,  and  Section  3  gives  more  description  of  some  major 
accomplishments.  Section  4  provides  a  list  of  scientific  publications  resulting  from  the  grant,  and 
these  publications  contain  full  details  of  the  progress.  Finally,  Section  5  provides  a  list  of  Ph.D. 
dissertations  resulting  from  this  grant. 


2.  OVERVIEW  OF  SCIENTIFIC  PROGRESS 

Prior  to  this  project,  our  work  on  the  oscillatory  correlation  approach  resulted  in  the  general
LEGION  architecture  for  scene  segmentation.  LEGION  stands  for  Locally  Excitatory  Globally
Inhibitory  Oscillator  Networks,  and  builds  on  relaxation  oscillators.  With  the  introduction  of  a 
lateral  potential  to  each  oscillator,  a  solution  to  remove  noisy  regions  in  a  scene  was  proposed  for 
LEGION  so  that  it  suppresses  the  oscillators  corresponding  to  noisy  and  insignificant  regions, 
without  affecting  those  corresponding  to  significant  ones.  We  have  analytically  shown  that  the 
resulting  oscillator  network  separates  a  scene  into  several  major  regions,  plus  a  background 
consisting  of  all  noisy  ones.  We  have  found  a  fast  numerical  method  -  the  singular  limit  method  - 
for  integrating  relaxation  oscillator  networks,  and  obtained  extensive  results  on  analyzing  time 
complexity  of  computing  using  oscillatory  correlation.  Also,  we  have  found  that  relaxation 
oscillators  show  a  wide  spectrum  of  behavior  with  parameter  adjustment,  ranging  from
integrate-and-fire  oscillators  to  sinusoidal  oscillators,  and  that  the  LEGION  architecture  can  be
extended  to  integrate-and-fire  oscillators.  Recently,  on  the  basis  of  oscillatory  correlation,  we
have  solved  the  long-standing  Minsky-Papert  connectedness  problem:  using  a  neural  network  to
detect  whether  an  arbitrary  figure  is  connected.
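
To make the notion of a relaxation oscillator concrete, the following is a minimal sketch of a single uncoupled oscillator. It assumes the Terman-Wang form commonly used in the LEGION literature; this report does not state the equations, and the parameter values (eps, gamma, beta), the stimulus values, and all function names are illustrative assumptions. The sketch only shows that a stimulated oscillator alternates between a silent and an active phase while an unstimulated one settles to rest; it does not include the local coupling, the global inhibitor, or the lateral potential.

```python
import math


def step(x, y, stim, dt, eps=0.02, gamma=6.0, beta=0.1):
    """Advance one oscillator by one fourth-order Runge-Kutta step."""

    def deriv(x, y):
        # Fast excitatory variable: cubic nullcline shifted by the stimulus.
        dx = 3.0 * x - x ** 3 + 2.0 - y + stim
        # Slow recovery variable: sigmoid nullcline, time scale set by eps.
        dy = eps * (gamma * (1.0 + math.tanh(x / beta)) - y)
        return dx, dy

    k1x, k1y = deriv(x, y)
    k2x, k2y = deriv(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = deriv(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = deriv(x + dt * k3x, y + dt * k3y)
    x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x, y


def simulate(stim, t_end=500.0, dt=0.02, x0=0.0, y0=0.0):
    """Return the trace of x for a constant stimulus (assumed values)."""
    x, y = x0, y0
    xs = []
    for _ in range(int(t_end / dt)):
        x, y = step(x, y, stim, dt)
        xs.append(x)
    return xs


def count_jumps(xs):
    """Count upward crossings of x = 0 (jumps from silent to active phase)."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a < 0.0 <= b)
```

Calling `count_jumps(simulate(0.8))` counts the silent-to-active jumps of a stimulated oscillator, while `simulate(-0.5)` should relax to a resting value on the left branch of the cubic.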

Although  the  above  progress  is  largely  on  theoretical  aspects  of  oscillatory  dynamics,  the  thrust 
of  the  work  conducted  during  this  project  was  to  apply  the  oscillatory  correlation  approach  to 
image  segmentation.  The  types  of  imagery  we  have  successfully  dealt  with  are  gray-level  medical 
and  aerial  images,  range  (depth)  images,  texture  images,  and  motion  images  (image  sequences). 
On  medical  images,  we  have  obtained  excellent  results  in  segmenting  anatomical  structures  of  the 
brain  from  CT  and  MRI  imagery.  On  aerial  images,  we  have  proposed  a  new  methodology  that 
combines  neural  network  learning  for  seed  selection,  weight  adaptation  for  noise  removal  and 
oscillatory  correlation.  The  resulting  method  has  been  applied  to  segmentation  and  object 
extraction  from  large-scale  aerial  imagery,  and  extensive  comparisons  show  that  our  results  are
significantly  better  than  those  obtained  by  other  methods.  On  range  image  segmentation,  we  have 
proposed  a  method  for  local  range  detection  that  combines  depth,  surface  normal,  and  mean  and 
Gaussian  curvatures.  On  texture  images,  we  have  proposed  to  use  Gaussian  Markov  Random 
Fields  as  an  effective  way  of  texture  feature  extraction,  and  applied  the  singular  limit  method  for 
integrating  a  large  system  of  differential  equations.  The  resulting  system  has  been  tested  on  texture 
images,  and  has  been  favorably  compared  with  other  algorithmic  systems  for  texture  image 
segmentation.  On  motion-based  segmentation,  our  methodology  integrates  motion  and  brightness
for  analyzing  image  sequences,  and  a  subsequent  network  combines  the  two  analyses  to  refine 
local  motion  estimates.  Again,  the  resulting  system  has  been  successfully  evaluated  with  real 
image  sequences,  and  compared  with  other  algorithms  for  motion  analysis  and  segmentation. 

Additionally,  we  have  proposed  a  new  architecture  for  object  selection  -  the  task  of  selecting 
target  objects  in  scenes.  Our  selection  network  builds  on  LEGION  dynamics  and  slow  inhibition, 
and  has  been  applied  to  select  the  most  salient  object  in  gray-level  images.  Very  recently,  we  have 
completed  a  study  that  integrates  a  primitive  segmentation  stage  with  a  model  of  associative 
memory.  The  integrated  system  is  evaluated  with  a  systematic  set  of  3-D  line  drawing  objects,  and 
memory-based  organization  is  responsible  for  a  large  improvement  in  performance. 


3.  DESCRIPTION  OF  SELECTED  WORKS 

3.1  Synchronization  and  Desynchronization  in  Large  Oscillator  Networks 

A  long-standing  problem  in  neural  computation  has  been  the  problem  of  connectedness,  first 
identified  by  Minsky  and  Papert  in  their  1969  landmark  book  on  perceptrons,  which  is  the  problem 
of  detecting  whether  an  arbitrary  figure  is  connected  by  a  neural  network.  This  problem  served  as
the  cornerstone  for  them  to  analytically  establish  that  perceptrons  are  fundamentally  limited  in 
computing  geometrical  (topological)  properties.  Despite  the  breakthrough  made  in  training 
multilayer  networks  in  the  mid-eighties,  which  led  to  a  remarkable  resurgence  in  neural  network 
research,  the  Minsky-Papert  problem  remained  unsolved  as  stated  in  their  expanded  1988  edition 
of  the  1969  book.  We  have  recently  solved  this  problem  by  employing  a  different  class  of  neural 
networks  -  oscillator  networks.  To  solve  the  problem,  the  representation  of  oscillatory  correlation 
is  employed,  and  it  emerges  from  a  LEGION  network.  It  is  further  shown  that  these  oscillator 
networks  exhibit  sensitivity  to  topological  structure,  which  may  lay  a  neurocomputational 
foundation  for  explaining  the  psychophysical  phenomenon  of  topological  perception. 

Our  solution  was  published  a  few  weeks  ago  in  Neural  Computation  (Wang,  2000;  see  journal
publications  listed  in  Sect.  4).  Figure  1  illustrates  our  solution  by  showing  the  response  of  a 
30x30  LEGION  network  to  two  figures:  one  connected  and  one  disconnected.  The  connected 
figure  is  a  "cup"  shown  in  Fig.  1A,  while  the  disconnected  one  is  the  image  of  the  word  "CUP"
shown  in  Fig.  1B.  The  LEGION  network  is  solved  using  a  Runge-Kutta  method.  The  oscillators
of  the  network  start  with  random  phases.  Fig.  1C  displays  temporal  activity  of  all  the  stimulated 
oscillators  for  the  connected  cup  image.  Unstimulated  oscillators  are  omitted  from  the  display 
because  they  do  not  oscillate.  The  oscillators  corresponding  to  each  pattern  are  combined  in  the 
display,  and  thus  appear  like  a  single  oscillator  when  they  are  in  synchrony.  The  upper  panel 
shows  the  oscillator  block  corresponding  to  the  cup,  and  the  middle  panel  shows  the  activity  of  the 
global  inhibitor.  Synchrony  occurs  in  the  first  cycle  of  oscillations.  The  case  for  the  disconnected
"CUP"  is  shown  in  Fig.  1D.  The  upper  three  traces  in  Fig.  1D  show  the  three  blocks
corresponding  to  the  three  patterns,  respectively,  and  the  fourth  one  the  activity  of  the  global
inhibitor.  The  bottom  traces  in  both  Fig.  1C  and  Fig.  1D  show  the  response  of  a  connectedness
predicate  for  these  two  cases,  where  θ  is  a  threshold.  After  a  short  initial  period
corresponding  to  the  process  of  synchronization  and  desynchronization,  the  predicate  correctly 
reveals  connectedness. 
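
As a point of reference for what the predicate decides, the sketch below computes connectedness with a conventional flood fill. This is not the oscillator network and not the predicate from the paper; it is a plain algorithmic stand-in in which each 4-connected group of stimulated cells plays the role of one synchronized oscillator block. The two small figures and all names are hypothetical.

```python
from collections import deque


def count_blocks(figure):
    """Count 4-connected groups of stimulated ('#') cells in a binary figure."""
    rows, cols = len(figure), len(figure[0])
    seen = [[False] * cols for _ in range(rows)]
    blocks = 0
    for r in range(rows):
        for c in range(cols):
            if figure[r][c] != "#" or seen[r][c]:
                continue
            # A new, unvisited stimulated cell starts a new block.
            blocks += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc),
                               (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and figure[nr][nc] == "#" and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return blocks


def is_connected(figure):
    """A figure is connected when it forms exactly one block."""
    return count_blocks(figure) == 1


# Hypothetical toy figures: a single cup-like shape and three separate letters.
cup = [
    "#...#.",
    "#...##",
    "#...##",
    "#####.",
]
letters = [
    "###.#.#.###",
    "#...#.#.#.#",
    "#...#.#.###",
    "###.###.#..",
]
```

Here `is_connected(cup)` holds, while `letters` splits into three blocks, mirroring the single trace versus the three desynchronized traces described above.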


3.2  Medical  Image  Segmentation

Advances  in  visualization  technology  and  specialized  graphic  workstations  allow  clinicians  to 
virtually  interact  with  anatomical  structures  contained  within  sampled  medical  image  datasets.  A 
hindrance  to  the  effective  use  of  this  technology  is  the  difficult  problem  of  image  segmentation. 
We  have  studied  LEGION  networks  for  grouping  similar  features  and  segregating  dissimilar  ones 
in  medical  imagery.  We  have  extracted  an  algorithm  from  LEGION  dynamics  and  proposed  an 
adaptive  scheme  for  grouping,  and  applied  the  algorithm  to  2D  and  3D  (volume)  CT  and  MRI 
medical  image  datasets.  In  addition,  we  have  compared  our  algorithm  with  other  algorithms, 
including  active  contours,  learning  vector  quantization,  and  Markov/Gibbs  random  field  models, 
for  medical  image  segmentation,  as  well  as  with  manual  segmentation.  The  comparisons  suggest 
that  LEGION  is  an  effective  computational  framework  to  tackle  the  problem  of  medical  image 
segmentation. 
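
The kind of grouping algorithm that can be extracted from LEGION dynamics may be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: a pixel acts as a leader when enough of its 4-neighbors have similar intensity, each leader recruits similar neighbors into its segment, and pixels that are never recruited fall into a background. The tolerance `tol`, the leader threshold `leader_min`, and the toy image are hypothetical, and the adaptive scheme mentioned above is not modeled.

```python
from collections import deque


def segment(img, tol=10, leader_min=3):
    """Label a gray-level image; label 0 marks the background."""
    rows, cols = len(img), len(img[0])

    def similar_neighbors(r, c):
        # 4-neighbors whose intensity differs by at most tol.
        result = []
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and abs(img[nr][nc] - img[r][c]) <= tol):
                result.append((nr, nc))
        return result

    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Only unlabeled pixels with enough similar neighbors lead a segment.
            if labels[r][c] or len(similar_neighbors(r, c)) < leader_min:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in similar_neighbors(cr, cc):
                    if labels[nr][nc] == 0:
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count


# Hypothetical image: two flat regions and one isolated noisy pixel (90).
image = [
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 90, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]
labels, count = segment(image)
```

In this toy case the two flat regions become two segments and the isolated pixel stays in the background, which is the qualitative behavior the text attributes to LEGION-based grouping.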

The  results  are  published  in  IEEE  Trans.  on  Medical  Imaging  (Shareef,  Wang,  and  Yagel,
1999).  Figure  2  illustrates  the  performance  of  our  system.  Top  left  shows  a  2D  MRI  image.  Top 
right  gives  a  color  map  showing  the  result  of  segmenting  the  image  by  a  LEGION  network. 
Different  segments  are  indicated  by  different  colors.  All  of  the  major  anatomical  regions  are 
correctly  segmented.  The  rest  of  the  figure  shows  the  segmentation  of  a  3D  MRI  volume  dataset 
for  extracting  the  brain.  The  segmentation  results  are  displayed  using  volume  rendering.  The 
middle  row  shows  the  results  of  a  top  view  (the  front  facing  downward)  and  the  bottom  row 
shows  a  side  view  (the  front  facing  leftward),  respectively.  The  two  left  images  show  the  results 
of  our  system,  and  the  two  right  ones  show  the  corresponding  results  produced  by  slice-by-slice 
manual  segmentation.  Although  the  results  of  manual  segmentation  fit  well  with  the  stereotype  of 
our  anatomical  knowledge,  the  results  of  our  algorithm  actually  better  reflect  details  of  this  dataset. 


3.3  Aerial  Image  Analysis

To  deal  with  aerial  image  segmentation  and  object  extraction,  we  have  proposed  a  weight 
adaptation  method  during  segmentation,  which  plays  the  roles  of  noise  removal  and  feature 
extraction.  In  particular,  our  weight  adaptation  scheme  is  insensitive  to  termination  times  -  a 
common  problem  in  various  smoothing  techniques  in  image  processing  -  and  the  resulting  dynamic 
weights  in  a  wide  range  of  iterations  are  applicable  to  achieve  the  same  segmentation  results.  The 
resulting  segmentation  method  combines  weight  adaptation  and  oscillatory  correlation.  For  a 
variety  of  large-scale  aerial  images  provided  by  the  U.S.  Geological  Survey  (USGS)  through  the 
Ohio  State  University  Center  for  Mapping,  our  algorithm  achieves  very  good  segmentation  results 
and  yields  favorable  comparisons  with  other  recent  image  processing  algorithms,  including 
nonlinear  smoothing  and  multi-scale  segmentation. 

The  results  are  summarized  in  two  papers  to  appear  in  IEEE  Trans.  on  Neural  Networks
(Chen,  Wang,  and  Liu,  2000)  and  IEEE  Trans.  on  Geoscience  and  Remote  Sensing  (Liu,  Chen,
and  Wang,  2000),  respectively.  Figure  3  illustrates  the  results  of  extracting  hydrographic  objects 
from  two  satellite  images.  The  original  images  containing  water  bodies  are  shown  in  the  top  row. 
The  middle  row  shows  the  corresponding  extraction  results.  To  facilitate  comparisons,  we  display 
the  water  bodies  by  marking  them  as  white  and  superimposing  them  on  the  original  images.  The 
bottom  row  provides  the  corresponding  USGS  1:24,000  topographic  maps.  Our  algorithm  extracts 
the  water  bodies  precisely,  even  along  narrow  river  branches.  Moreover,  important  details  are 
preserved,  such  as  the  small  island  near  the  uppermost  river  branch  in  the  upper  left  image.  A 
careful  comparison  between  the  extracted  regions  and  the  maps  indicates  that  the  former  portray  the
images  even  a  little  better,  because  stationary  maps  do  not  reflect  well  the  changing  nature  of 
geography. 

Figure  4A  shows  a  very  large  image  (6204x7676)  from  the  Washington  East,  D.C.-Maryland
area.  Figure  4B  shows  the  result  of  hydrographic  object  (river  in  this  case)  extraction  by  our
system.  For  comparison,  Figure  4C  shows  the  corresponding  result  by  a  multilayer  perceptron.
The  perceptron  is  first  trained  using  typical  samples  from  both  hydrographic  and  non-hydrographic 
regions,  and  is  then  applied  to  classify  the  entire  image.  It  is  clear  from  Figure  4  that  our  system 
performs  large-scale  hydrographic  extraction  with  high  accuracy,  and  does  a  much  better  job  than 
classification  by  a  multilayer  perceptron. 


3.4  Texture  Segmentation

Texture  is  important  for  visual  analysis.  Segmentation  based  on  texture,  however,  has  proven
to  be  a  very  difficult  task  in  machine  vision.  Many  methods  have  been  proposed  to  deal  with  the 
problem,  including  statistical,  geometrical,  and  model-based  methods.  Our  proposed  method 
consists  of  two  parts.  The  first  part  determines  a  set  of  texture  features  with  a  novel  method 
inspired  by  Gaussian  Markov  Random  Fields  (GMRF).  Unlike  other  GMRF-based  methods,  ours 
is  a  generic  formulation,  not  limited  by  a  fixed  set  of  texture  types.  The  second  part  is  a  two- 
dimensional  LEGION  network.  The  coupling  strengths  between  neighboring  oscillators  are 
determined  by  texture  feature  differences.  In  our  simulations,  a  large  system  of  differential
equations  is  solved  using  the  singular  limit  method.  A  careful  comparison  with  other  methods
shows  that  our  results  are  at  least  as  good  as  those  of  methods  commonly  used  in  computer  vision.
Also,  our  approach  offers  several  methodological  advantages:  the  assumptions  embedded  in  our
method  are  weaker  than  those  of  other  methods  (MRF,  for  example),  and  our  method  tends  to  work
satisfactorily  for
novel  texture  types. 
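
One way to picture coupling strengths determined by feature differences is the sketch below. The mapping `w_max / (1 + distance)` is an assumed form chosen for illustration only; the report does not give the formula, and the feature vectors here are made-up numbers rather than GMRF-derived features. The point is simply that neighbors with similar texture features couple strongly, and neighbors across a texture boundary couple weakly.

```python
import math


def coupling_strength(feat_a, feat_b, w_max=1.0):
    """Map a texture-feature difference to a coupling weight in (0, w_max]."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    return w_max / (1.0 + dist)


def grid_couplings(features):
    """Return {((r, c), (nr, nc)): weight} for right and down neighbors."""
    rows, cols = len(features), len(features[0])
    weights = {}
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    weights[((r, c), (nr, nc))] = coupling_strength(
                        features[r][c], features[nr][nc])
    return weights


# Hypothetical 2x2 grid: the top row has one texture, the bottom row another.
features = [
    [(0.0, 0.0), (0.0, 0.0)],
    [(3.0, 4.0), (3.0, 4.0)],
]
weights = grid_couplings(features)
```

Within each row the weight is at its maximum, while the vertical links across the texture boundary are much weaker, so a network built on such weights would tend to synchronize within rows and not across them.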

As  an  illustration,  Figure  5  shows  some  of  our  segmentation  results.  The  left  side  displays  five
original  images  taken  from  the  commonly  used  Brodatz  Album.  The  right  side  shows  the 
corresponding  results  of  LEGION  segmentation,  where  different  gray  levels  indicate  different 
segments  and  black  scattered  areas  indicate  the  background  resulting  from  LEGION  segmentation. 

Given  a  distinct  texture  (a  giraffe  in  this  example),  a  target  can  be  effectively  extracted  from  a 
cluttered  scene.  This  task  of  target  extraction  based  on  natural  texture  is  illustrated  in  Figure  6. 
The  top  is  an  input  image  that  contains  a  giraffe  and  the  bottom  is  the  result  of  our  texture  target 
extraction.  The  extracted  target  is  embedded  in  the  original  by  lowering  the  intensity  of  those 
pixels  that  do  not  belong  to  the  target. 

Figure  6


3.5  Motion  Segmentation

This  part  of  the  project  concerns  analysis  and  segmentation  of  images  based  on  motion.  Unlike 
many  algorithms  for  motion-based  segmentation,  our  system  integrates  motion  and  brightness  for 
analyzing  image  sequences.  We  have  proposed  two  parallel  pathways  that  first  process  motion  and 
brightness  separately,  and  then  a  subsequent  network  combines  the  two  to  refine  local  motion 
estimates.  Like  other  segmentation  tasks  we  have  dealt  with,  LEGION  networks  are  employed  for 
final  grouping  and  segmentation.  In  addition  to  successful  evaluation  with  real  image  sequences, 
our  system  exhibits  a  number  of  important  properties  in  human  motion  perception;  these  include 
motion  transparency  and  an  elegant  treatment  of  the  so-called  blank  wall  problem,  which  refers  to 
our  ability  to  perceive  a  moving  whole  despite  no  local  motion  signal  in  the  interior  of  the  whole. 
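
The blank wall problem and the benefit of combining motion with brightness can be illustrated with a one-dimensional toy example. This is only a sketch under strong assumptions, not the two-pathway network described above: the local motion signal is a plain frame difference, brightness grouping is a run of equal values, and the background brightness is assumed known. All names and the two frames are hypothetical.

```python
def local_motion_signal(frame_a, frame_b):
    """Flag pixels whose brightness changes between two frames."""
    return [a != b for a, b in zip(frame_a, frame_b)]


def brightness_segments(frame):
    """Split a 1-D frame into runs of equal brightness as (start, end) pairs."""
    segments, start = [], 0
    for i in range(1, len(frame) + 1):
        # Close the current run at the end of the frame or at a brightness change.
        if i == len(frame) or frame[i] != frame[start]:
            segments.append((start, i))
            start = i
    return segments


def moving_pixels(frame_a, frame_b, background=0):
    """Mark a whole non-background segment as moving if any pixel in it changed."""
    signal = local_motion_signal(frame_a, frame_b)
    moving = [False] * len(frame_a)
    for start, end in brightness_segments(frame_a):
        if frame_a[start] != background and any(signal[start:end]):
            for i in range(start, end):
                moving[i] = True
    return moving


# A uniform bar (brightness 5) shifts one pixel to the right.
frame_a = [0, 0, 5, 5, 5, 5, 0, 0]
frame_b = [0, 0, 0, 5, 5, 5, 5, 0]
signal = local_motion_signal(frame_a, frame_b)
moving = moving_pixels(frame_a, frame_b)
```

The frame difference fires only at the two ends of the bar, so the interior carries no local motion signal; grouping by brightness lets the whole bar be labeled as moving.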

The  results  are  summarized  in  a  paper  to  appear  in  IEEE  Trans.  on  Neural  Networks  (Cesmeli
and  Wang,  2000).  Figure  7  illustrates  the  performance  of  our  method  on  real  moving  scenes.  Fig. 
7A  shows  one  frame  of  the  input  sequence,  where  a  motorcycle  rider  jumps  to  a  (dry)  canal  with 
his  motorcycle  while  the  camera  is  tracking  him.  Due  to  the  camera  motion,  the  rider  and  his 
motorcycle  have  a  downward  motion  with  a  small  rightward  component  and  the  image  background 
has  an  upright  diagonal  motion.  Fig.  7B  shows  motion  estimates  after  integrating  motion  and 
brightness  analyses,  and  our  estimated  optic  flow  is  largely  correct.  Based  on  these  estimates,  the 
rider  with  his  motorcycle  is  accurately  segmented  from  the  image  background  as  depicted  in  Figure
7C.  As  the  rider/motorcycle  segment  shows,  regions  with  different  texture  and
brightness  are  grouped  into  a  single  segment  due  to  common  motion.

We  have  compared  our  neural  oscillator  model  with  three  representative  algorithms  proposed 
by  Horn  and  Schunck  (1981),  Anandan  (1987),  and  Black  (1996),  respectively,  for  the  scene 
given  in  Figure  7A.  The  results  of  the  algorithms  of  Horn  and  Schunck,  Anandan,  and  Black  are 
given  in  Figure  7D,  E,  and  F,  respectively.  The  Horn  and  Schunck  algorithm  cannot  capture  the 
motion  of  the  regions  (Fig.  7D),  and  similarly,  the  Anandan  algorithm  cannot  accurately  localize 
the  motion  boundaries  (Fig.  7E).  The  Black  algorithm  offers  the  best  result  by  far  among  the  three 
(Fig.  7F).  Unlike  ours,  the  Black  algorithm  requires  the  number  of  regions  as  an  input  parameter. 
When  two  regions  are  assumed  in  this  case,  it  can  group  the  locations  into  two  segments  that  best 
account  for  the  motion  distribution  in  the  scene.  However,  local  motions  are  not  estimated 
accurately  and  the  motion  boundary  between  the  rider/motorcycle  segment  and  the  image 
background  is  not  well  localized.  Except  for  the  limitation  that  our  model  does  not  produce  motion 
estimates  near  the  image  border  (see  Fig.  7B),  our  neural  network  method  yields  the  most  accurate 
motion  boundaries. 